# Multi-round Visual Dialogue
Internlm Xcomposer2 4khd 7b
Other
InternLM-XComposer2-4KHD is a general-purpose large vision-language model based on InternLM2 that can understand images at up to 4K resolution.
Text-to-Image
Transformers

internlm
1,180
73
Cogagent Vqa Hf
Apache-2.0
CogAgent is an open-source vision-language model based on CogVLM, focused on single-round visual question answering.
Text-to-Image
Transformers English

THUDM
238
49
Cogagent Chat Hf
Apache-2.0
CogAgent is an open-source vision-language model that improves on CogVLM, adding GUI agent capabilities, multi-round visual dialogue, and visual grounding.
Text-to-Image
Transformers English

THUDM
503
69